An important component of pathogen evolution at the population level is evolution within hosts. Unless evolution within hosts is
very slow compared to the duration of infection, the composition of pathogen genotypes within a host is likely to change during 
the course of an infection, thus altering the composition of genotypes available for transmission as infection progresses. We 
develop a nested modeling approach that allows us to follow the evolution of pathogens at the epidemiological level by explicitly 
considering within-host evolutionary dynamics of multiple competing strains and the timing of transmission. We use the framework 
to investigate the impact of short-sighted within-host evolution on the evolution of virulence of human immunodeficiency virus 
(HIV), and find that the topology of the within-host adaptive landscape determines how virulence evolves at the epidemiological 
level. If viral reproduction rates increase significantly during the course of infection, the viral population will evolve a high level 
of virulence even though this will reduce the transmission potential of the virus. However, if reproduction rates increase more 
modestly, as data suggest, our model predicts that HIV virulence will be only marginally higher than the level that maximizes the 
transmission potential of the virus.

Understanding how pathogens evolve at the epidemiological level
is vital if we are going to accurately assess how epidemics and 
pandemics are likely to progress, and what the consequences of 
biomedical and other interventions are likely to be. Important 
components of pathogen evolution at the population level are the 
ecological and evolutionary processes that occur during the course 
of an infection. As a consequence, pathogens can face conflicting 
evolutionary pressures because traits that maximize the within- 
host fitness of a pathogen strain might reduce its between-host 
fitness. This conflict will be particularly strong when multiple 
pathogen strains persist in a single host, either due to infection by 
multiple strains or by the generation of multiple strains through 
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mutation. Here, we develop a nested modeling framework that al- 
lows us to follow the evolution of pathogens at the epidemiological 
level, and find equilibrium values, by explicitly considering the 
within-host evolutionary dynamics of multiple competing strains 
and the timing of transmission. We use the framework to assess 
the impact of within-host processes on the evolution of virulence 
of human immunodeficiency virus (HIV) at the epidemiological 
level, although the approach could be applied to a number of 
host-pathogen systems. 

The relatively stable set-point viral load (SPVL) of HIV ob- 
served during chronic asymptomatic infection is a commonly used 
proxy for virulence (Müller et al. 2011). A high SPVL increases
the probability that virus will be transmitted, but also hastens 
the onset of AIDS and eventual death, thus reducing the period 
during which the virus can be transmitted (Mellors et al. 1996;
De Wolf et al. 1997; Korenromp et al. 2009). SPVL is at least
partly heritable from infection to infection (Tang et al. 2004; Al- 
izon et al. 2010; Hecht et al. 2010; Hollingsworth et al. 2010; 
van der Kuyl et al. 2010), and it has recently been shown that 
HIV seems to have evolved an intermediate level of virulence that 
maximizes the number of potential new infections from a single 
infection, known as its transmission potential (Fraser et al. 2007; 
Shirreff et al. 2011), thus maximizing the between-host fitness 
of the virus. With a higher level of virulence, the virus is more 
likely to be transmitted while the infection lasts, but death due to 
AIDS will be swifter, resulting in fewer onward infections during 
the lifetime of the infected individual. With a lower level of viru- 
lence the host will live longer, but the rate of onward transmission 
will be lowered, again reducing the number of onward infections 
during the lifetime of the infected individual. 

However, the evolution of an intermediate level of virulence 
that maximizes transmission potential sits uncomfortably with the 
concept of short-sighted evolution (Levin and Bull 1994; Frank 
2012). During the course of long-term infections we should expect
strains with a competitive advantage to sweep through the within- 
host population if and when they arise, regardless of whether this 
reduces the transmission potential of the current or subsequent 
infections. Evolution is "short-sighted" because what is good for 
the virus in the short-term within the host is not necessarily what is 
good for the virus in the longer-term at the epidemiological level. 
This is analogous to the concept of the tragedy of the commons 
seen in models of social evolution (Rankin et al. 2007). 

There is good reason to believe that the evolution of HIV 
should be short-sighted. Infection with a strain of HIV that has a 
high (low) replicative capacity is likely to result in an infection 
with a high (low) viral load (Quinones-Mateu et al. 2000; Trkola 
et al. 2003; Daar et al. 2005; Joos et al. 2005; Kouyos et al. 
2011). In other words, the replicative capacity of a viral strain 
is correlated with virulence. HIV can evolve extremely quickly 
during the course of infection (Shankarappa et al. 1999; Lemey 
et al. 2007), on the face of it giving the virus ample opportunity 
to produce strains with a high replicative capacity and for these 
to sweep through the within-host population. Evidence suggests 
that the replicative capacity of HIV does indeed tend to increase 
during the course of an infection (Troyer et al. 2005; Kouyos et al. 
2011). As a consequence, if within-host fitness is correlated with
virulence, we might expect the virulence of HIV to be relatively 
high as a result of short-sighted evolution, even if this is not the 
best strategy for the virus at the epidemiological level. 

To better understand the conflicting evolutionary pressures 
influencing the evolution of HIV virulence, we have constructed 
a nested modeling framework that integrates within-host evolu- 
tionary dynamics and between-host dynamics, and that allows 
a large number of strains to coexist within a host at any one 
time. A growing number of models have been developed enabling
us to link within-host and between-host dynamics (examples include:
Sasaki and Iwasa 1991; Gilchrist et al. 2002; André and
Gandon 2006; Gilchrist and Coombs 2006; Coombs et al. 2007; 
Alizon and van Baalen 2008; Luciani and Alizon 2009; Feng et al. 
2011; Saenz and Bonhoeffer 2013), but apart from the individual-based
HCV simulation study of Luciani and Alizon (2009), none
have considered more than two strains coexisting within a host 
at any one time. Incorporating multiple strains into our models, 
when they affect phenotype and when they are likely to coex- 
ist within hosts, is important because otherwise it is impossible 
to adequately assess the consequences of within-host ecologi- 
cal and evolutionary processes on the epidemiological dynamics 
of pathogens. Here, we explicitly incorporate the evolution of 
pathogens through the course of infection, allowing for differen- 
tial transmission of pathogen strains depending on their frequency 
within the host at the time of transmission, the intensity of the 
infection, and the inherent transmissibility of the strain. Unlike 
other nested models, we model the within-host dynamics using a 
quasi-species approximation, meaning that the frequencies of the
different strains within the host are determined by their reproduction
rates and by the probability of mutation from one strain to
another at the time of replication. The frequencies are then multi- 
plied by empirically determined infectivity profiles to express the 
viral load during the course of infection, and the duration of infec- 
tion as a function of the virulence of the infecting strain (Fraser 
et al. 2007; Shirreff et al. 2011). This allows for very efficient 
computation of the within-host dynamics. However, if desired, a 
more mechanistic model could be used instead. 

Crucially, we find that by changing the shape of the within- 
host adaptive landscape and the intensity of within-host and 
between-host competition, we can reach qualitatively different 
predictions, including whether one or multiple strains persist 
within the viral population, and whether the virus evolves toward 
low, intermediate, or high levels of virulence at the population 
level. Indeed, in some cases the prevalence gets to such low levels 
that the viral population effectively drives itself to extinction. In 
all cases, as the intensity of within-host competition increases, 
the fitness of the viral population at the epidemiological level, as 
measured by the basic reproduction number $R_0$, decreases.

Data suggest that the replicative capacity of HIV increases 
during the course of infections, but by a relatively modest amount 
compared to the variance in replicative capacities observed across 
patients (Troyer et al. 2005; Kouyos et al. 2011). In this situation,
our model predicts that between-host processes will overshadow 
within-host processes and therefore HIV will evolve a level of virulence
that is similar to the level that maximizes $R_0$. If interpreted
in terms of within- versus between-group selection (where here a 
group is the collection of viruses within an individual), the result 
agrees with our intuition from standard theory. That is, when variation
among groups is much larger than variation within groups,
selection at the between-group level will overshadow selection at
the within-group level (see Frank 2012 for further details).

Although here we apply our framework to consider the prob- 
lem of virulence in HIV, it is a general framework that could be 
applied to a large number of host-pathogen systems in which in- 
dividual hosts are coinfected by multiple strains, or where new 
strains arise frequently as a consequence of mutation. Importantly, 
this modeling framework is extremely flexible. For example, it can 
be used in conjunction with an explicit model of within-host evo- 
lution, or as implemented here, it can incorporate a more generic 
quasi-species model of within-host evolution combined with em- 
pirically determined infectivity profiles. 

Methods 

We construct our nested model by linking the within-host evolu- 
tionary dynamics of HIV with between-host dynamics, describ- 
ing the spread of the virus in an exposed human population. The 
between-host modeling framework we use is based on the well- 
developed theory of multitype epidemic models (Diekmann and 
Heesterbeek 2000). In this framework, individuals are divided into 
types based on their epidemiological characteristics. We assume 
that once a host is infected the course of infection is only driven 
by within-host dynamics and is not influenced by other factors. 
This is a common assumption when dealing with microparasites, 
and its main consequence is that we can rely on standard epidemi- 
ological theory for the spread of the infection at the between-host 
level. However, it also means that we ignore effects such as super- 
infection. This assumption should be revisited once the dynamics 
of superinfection are more clearly understood. For example, it 
has recently been shown that superinfection is much more com- 
mon than was previously thought, although it is unclear how this 
influences the course of infection (Redd et al. 2012b). 

WITHIN-HOST MODEL 

It is now well known that most new heterosexual HIV infections 
are established by a single virion (Keele et al. 2008), and therefore 
for simplicity we assume that all new infections are established 
from a single viral strain. We also assume that host factors do not 
influence the course of the infection, so that all susceptible indi- 
viduals are identical. Because only within-host factors influence 
infection, the course of infection is uniquely determined by the 
strain initiating it, and therefore we can identify infected individuals
by their type, where a type-j individual is someone initially
infected with strain j. The assumption of infection by a single
strain and identical susceptible hosts can easily be relaxed at the 
price of increasing the number of types. 

Because of mutation and subsequent competition within individuals,
someone infected by strain j can infect a susceptible person with
strain i. In a fully susceptible population, a type-j individual will
generate type-i individuals through transmission at a time-dependent
rate $\beta_{ij}(\tau)$, where $\tau$ is the time since the type-j
individual was infected ($\tau$ is often referred to as the age of
infection).

The within-host component of the model is used to characterize
the strain-specific infectivity profile, $\beta_{ij}(\tau)$, for all
infection types.

The strain-specific infectivity profile of a type-j infection
could be determined using a mechanistic within-host model, describing
the interaction of the virus with the host's immune system
(e.g., Coombs et al. 2007). However, simple competition models
cannot reproduce the complex profiles of time-varying infectivity
characteristic of HIV, and therefore we use a more pragmatic
approach. By using available data, we define an overall time-varying
infectivity profile $\alpha_j(\tau)$ for a type-j individual, and model
the change in frequencies of each strain, $x_{ij}(\tau)$, within each type
of host using the reproduction-mutation quasi-species equation
(Nowak 2006). Here $x_{ij}(\tau)$ represents the frequency of strain i in a
type-j host at time $\tau$ since infection. We finally assume that strains
can differ relative to each other in how efficiently they are transmitted
between hosts, and denote the relative between-host transmissibility
of strain i by $\sigma_i$. Combining all these elements, we obtain
the general equation for the strain-specific infectivity profile:

$$\beta_{ij}(\tau) = \sigma_i \, x_{ij}(\tau) \, \alpha_j(\tau). \tag{1}$$

To define the overall infectivity profile, $\alpha_j(\tau)$, for a type-j
individual, we follow Shirreff et al. (2011) (see also Saenz and
Bonhoeffer 2013). The overall infectivity profile of HIV is assumed
to have three stages: primary, asymptomatic, and AIDS.
The duration and infectivity of the primary and AIDS stages are
assumed to be equal for all infections, but the more virulent the
infecting strain, the greater the infectivity and the shorter the
duration of the asymptomatic stage, and therefore the shorter the
duration of the entire infection, $T_j$ (see fig. 2 of Fraser et al. 2007
and Shirreff et al. 2011 for further details and the precise
formalization).

To calculate the frequency of strain i in a type-j host, $x_{ij}(\tau)$,
we use the reproduction-mutation quasi-species equation, as follows:
let $\mathbf{y} = (y_i)$ be the (column) vector of the number of virions of
each strain within a host in an unbounded reproduction-mutation
system. Also, let $M = (m_{ij})$ be the mutation matrix, where $m_{ij}$ is
the probability that the progeny of a strain-j virion is a strain-i
virion. Next, let $g_i$ be the reproduction rate of strain i. Finally, let
$Q = (q_{ij}) = (m_{ij} g_j)$ be the reproduction-mutation matrix. Then
the unbounded reproduction-mutation system is described by the
equation:

$$\frac{d\mathbf{y}}{d\tau} = Q\mathbf{y} \tag{2}$$

with initial condition $\mathbf{y}(0) = \mathbf{y}^0$ and solution
$\mathbf{y}(\tau) = e^{Q\tau}\mathbf{y}^0$, where we have made use of matrix
exponentiation. Denoting the total number of virions by
$y = \sum_i y_i$, the system for the frequencies
$\mathbf{x} = (x_i) = \mathbf{y}/y$ is given by the quasi-species equation:

$$\frac{d\mathbf{x}}{d\tau} = Q\mathbf{x} - \bar{g}\,\mathbf{x}, \tag{3}$$

where $\bar{g} = \bar{g}(\mathbf{x}) = \sum_j g_j x_j(\tau)$. The second term in
equation (3) ensures that the strain frequencies always sum to 1. Given the
initial condition $\mathbf{x}(0) = \mathbf{x}^0 = \mathbf{y}^0 / y(0)$, the ith
element of the solution of equation (3), $x_i(\tau)$, can be written in the
form (Domingo et al. 2008):

$$x_i(\tau) = \frac{y_i(\tau)}{y(\tau)} = \frac{(e^{Q\tau}\mathbf{y}^0)_i}{\sum_k (e^{Q\tau}\mathbf{y}^0)_k} = \frac{(e^{Q\tau}\mathbf{x}^0)_i}{\sum_k (e^{Q\tau}\mathbf{x}^0)_k}, \tag{4}$$

where $(\mathbf{v})_i$ represents the ith element of vector $\mathbf{v}$. The last
equality follows because the solution does not change if the initial
condition is multiplied by any arbitrary constant.
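The scale invariance noted above suggests a simple way to evaluate equation (4) numerically without overflow: propagate the unbounded system one day at a time and rescale to frequencies after each step. The following is a minimal pure-Python sketch under stated assumptions, not the authors' implementation. The helper names (`expm_taylor`, `hill_climb_Q`, `frequencies`) are hypothetical, the Taylor-series matrix exponential is only adequate for small matrices with entries of order 1, and the diagonal of the mutation matrix is assumed to be one minus the off-diagonal mutation probabilities.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def expm_taylor(A, terms=30):
    """Matrix exponential by truncated Taylor series (small matrices only)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes A^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    return result

def hill_climb_Q(g, m):
    """Q = (m_ij * g_j), assuming mutation only between neighbouring strains."""
    n = len(g)
    Q = [[0.0] * n for _ in range(n)]
    for j in range(n):
        neighbours = [i for i in (j - 1, j + 1) if 0 <= i < n]
        for i in neighbours:
            Q[i][j] = m * g[j]
        # Assumption: progeny that do not mutate remain strain j.
        Q[j][j] = (1.0 - m * len(neighbours)) * g[j]
    return Q

def frequencies(Q, founder, days):
    """Strain frequencies after `days` days in a host founded by `founder`."""
    step = expm_taylor(Q)              # e^{Q * 1 day}, Q in per-day units
    x = [0.0] * len(Q)
    x[founder] = 1.0                   # x^0 = e_j
    for _ in range(days):
        y = mat_vec(step, x)           # unbounded growth over one day
        total = sum(y)
        x = [v / total for v in y]     # rescaling leaves frequencies unchanged
    return x

# Four strains, g from 1 to 1.025 per day, mutation 5e-5 per replication.
g = [1.0 + 0.025 * i / 3 for i in range(4)]
Q = hill_climb_Q(g, m=5e-5)
x_20y = frequencies(Q, founder=0, days=20 * 365)
```

Rescaling at every step keeps all numbers of order 1, whereas evaluating $e^{Q\tau}$ directly over years of daily replication would exceed floating-point range.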

So far, we have let the viral population grow unbounded. 
However, if we assume that virion death, for example due to the 
immune system, limits the viral load, and that the death rate is 
strain independent, the above equation still holds. This is, ad- 
mittedly, a very strong assumption, but is essential for analytical 
tractability. For the same reason of tractability, we also assume 
that the reproduction rates and mutation probabilities of the dif- 
ferent strains remain constant throughout an infection. The death 
rate instead is allowed to vary during the course of the infection, 
and this is assumed to happen in such a way that the total viral load 
becomes consistent with the required overall infectivity profile. 

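If the infection starts with a single virion of strain j, then we
can write $\mathbf{x}^0 = \mathbf{e}_j$, where $\mathbf{e}_j$ denotes the column
vector with all 0 elements, except for a single 1 in position j. It follows that

$$x_{ij}(\tau) = \frac{(e^{Q\tau}\mathbf{e}_j)_i}{\sum_k (e^{Q\tau}\mathbf{e}_j)_k} = \frac{(e^{Q\tau})_{ij}}{\sum_k (e^{Q\tau})_{kj}}. \tag{5}$$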



Because $x_{ij}(\tau)$ is the frequency of strain i virions in a type-j
host at time $\tau$ since infection, equation (5) can be substituted into
equation (1) to calculate the strain-specific infectivity profile.
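To make the substitution concrete, here is a small illustrative sketch of equation (1) at a single time point. Everything numerical in it is a placeholder: the step-function profile `alpha_three_stage`, its stage durations and infectivities, and the example frequencies are hypothetical values chosen only to show the bookkeeping, not the empirically determined profiles the paper takes from Fraser et al. (2007) and Shirreff et al. (2011).

```python
def alpha_three_stage(tau, t_asym, a_asym,
                      t_prim=0.25, a_prim=2.5, t_aids=0.75, a_aids=0.75):
    """Hypothetical three-stage infectivity profile (per year), tau in years.

    Primary and AIDS stages are identical across infection types; only the
    asymptomatic duration and infectivity depend on the founding strain.
    """
    if tau < 0.0:
        return 0.0
    if tau < t_prim:
        return a_prim
    if tau < t_prim + t_asym:
        return a_asym
    if tau < t_prim + t_asym + t_aids:
        return a_aids
    return 0.0

def beta_column(sigma, x_col, alpha_j):
    """beta_ij = sigma_i * x_ij * alpha_j for one type j at one time tau."""
    return [s * x * alpha_j for s, x in zip(sigma, x_col)]

sigma = [1.0, 1.0, 1.0, 1.0]            # equally transmissible strains
x_col = [0.70, 0.20, 0.08, 0.02]        # placeholder frequencies x_ij(tau)
alpha = alpha_three_stage(3.0, t_asym=8.0, a_asym=0.1)
beta = beta_column(sigma, x_col, alpha)
```

When all strains are equally transmissible, the strain-specific rates sum to the overall infectivity, because the frequencies sum to 1.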

BETWEEN-HOST MODEL 

We next describe the between-host model. All individuals have
a natural mortality rate $\mu$ and, in addition, type-j individuals
are assumed to die at time $T_j$ after infection, assuming they
have not already succumbed to natural mortality. As described in
the previous section, the duration of infection is only affected
by the duration of asymptomatic infection, which in turn is determined
by the virulence of the infecting strain (Shirreff et al.
2011). Denoting the time elapsed since the beginning of the epidemic
by t, the force of infection of strain i at time t due to a type-j
individual infected at time $t - \tau$ is $\beta_{ij}(\tau) e^{-\mu\tau}$ for
$\tau < T_j$ and 0 for $\tau > T_j$. The factor $e^{-\mu\tau}$ represents the
fraction of individuals infected at time $t - \tau$ that have survived up
to time t.

Assuming random mixing, if the population is not fully
susceptible, only a fraction $S(t)/N(t)$ of infectious attempts will
result in a real infection, where $S(t)$ is the number of susceptible
individuals at time t and $N(t)$ is the total population size
at time t. Denoting the incidence of type-i cases at time t by
$H_i(t)$, where incidence is defined as the rate of new infections,
and assuming that individuals enter the exposed population with
constant overall birth rate B, the epidemiological dynamics are
given by

$$H_i(t) = \frac{S(t)}{N(t)} \sum_{j=1}^{n} \int_0^{T_j} \beta_{ij}(\tau)\, H_j(t-\tau)\, e^{-\mu\tau}\, d\tau, \tag{6.1}$$

$$S(t) = N(t) - \sum_{j=1}^{n} \int_0^{T_j} H_j(t-\tau)\, e^{-\mu\tau}\, d\tau, \tag{6.2}$$

$$\frac{dN(t)}{dt} = B - \mu N(t) - \sum_{j=1}^{n} H_j(t - T_j)\, e^{-\mu T_j}. \tag{6.3}$$

The underlying epidemiological model used here is a
susceptible-infected (SI) model with demography. The quantity
$I_j(t) = \int_0^{T_j} H_j(t-\tau)\, e^{-\mu\tau}\, d\tau$ appearing in
equation (6.2) represents the total number of type-j individuals at
time t. Using a star to denote quantities at equilibrium, we find
that

$$H_i^* = \frac{S^*}{N^*} \sum_{j} k_{ij} H_j^*, \tag{7.1}$$

$$S^* = N^* - \sum_{j} H_j^* \, \frac{1 - e^{-\mu T_j}}{\mu}, \tag{7.2}$$

$$N^* = \frac{B - \sum_{j} H_j^* e^{-\mu T_j}}{\mu}, \tag{7.3}$$

where $K = (k_{ij}) = \left(\int_0^{T_j} \beta_{ij}(\tau)\, e^{-\mu\tau}\, d\tau\right)$ is the
so-called next-generation matrix (Diekmann and Heesterbeek 2000). Each
element $k_{ij}$ represents the average number of type-i infections
generated in a fully susceptible population by a type-j individual
during their entire infectious life. Standard epidemiological
theory (Diekmann and Heesterbeek 2000) allows fast analytical
computation of all quantities at equilibrium, thanks to the use of
the next-generation matrix, as follows. Perron-Frobenius theory
of positive matrices assures that K has a unique (real) dominant
eigenvalue, which represents the correct definition of the basic
reproduction number $R_0$ (see Diekmann and Heesterbeek 2000,
ch. 5.1). Also, from equation (7.1), the vector $(H_i^*)$ representing
the incidence of each type at equilibrium is an eigenvector of K
relative to $R_0$. Furthermore, $R_0$ represents the average number of
new infections a typical infected individual generates in a fully
susceptible population (see Diekmann and Heesterbeek 2000,
p. 95, for the precise meaning of the word "typical"). At equilibrium,
the number of infected individuals does not change, so
that on average each infective must generate one new infection
before dying. Therefore, the condition for equilibrium is



$$\frac{S^*}{N^*} = \frac{1}{R_0}. \tag{8}$$
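As an illustration of how $R_0$ and the associated eigenvector can be obtained from a next-generation matrix, the sketch below uses plain power iteration, which converges for a matrix with strictly positive entries. The function name `dominant_eigenpair` and the 2 x 2 matrix are hypothetical; in the model, the entries would come from integrating the strain-specific infectivity profiles. The paper does not state which eigenvalue routine the authors used.

```python
def dominant_eigenpair(K, iters=10000, tol=1e-13):
    """Power iteration; returns (R0, h) with sum(h) == 1."""
    n = len(K)
    h = [1.0 / n] * n
    r0 = 0.0
    r0_new = 0.0
    for _ in range(iters):
        w = [sum(K[i][j] * h[j] for j in range(n)) for i in range(n)]
        r0_new = sum(w)              # equals the eigenvalue once h has converged
        h = [v / r0_new for v in w]  # keep components summing to 1
        if abs(r0_new - r0) < tol:
            break
        r0 = r0_new
    return r0_new, h

# Hypothetical next-generation matrix: K[i][j] is the mean number of
# type-i infections caused by one type-j individual.
K = [[1.5, 0.5],
     [0.2, 0.8]]
R0, h = dominant_eigenpair(K)
susceptible_fraction = 1.0 / R0      # equilibrium condition S*/N* = 1/R0
```

Normalizing the iterate to sum to 1 at each step returns the eigenvector directly in the form of a distribution over infection types.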



By denoting the total incidence at equilibrium by $H^*$, and the
eigenvector of K relative to $R_0$, normalized to have components
summing to 1, by $(h_i)$, we have $H_i^* = H^* h_i$. With some algebraic
manipulation, we find that, at equilibrium:

$$H^* = \frac{B(R_0 - 1)}{R_0 - \sum_{j} h_j e^{-\mu T_j}}. \tag{9}$$



Because we can derive $R_0$ and $(h_i)$ from K, and $H^*$ from
equation (9), we can calculate all quantities at equilibrium from
equations (7.1)-(7.3). As already noted, here we have used an SI
model with demography, as it applies to the case of HIV, but in 
principle any between-host model structure could be used, as long 
as it can be described using a next-generation matrix formalism. 
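The equilibrium calculation can be sketched in a few lines. The total incidence used here, $H^* = B(R_0 - 1) / (R_0 - \sum_j h_j e^{-\mu T_j})$, is what one obtains by substituting $H_j^* = H^* h_j$ into the equilibrium expressions for $S^*$ and $N^*$ and imposing $S^*/N^* = 1/R_0$. The function name `equilibrium` and the example values of $R_0$, the normalized eigenvector, and the infection durations are hypothetical inputs, not results from the paper.

```python
import math

def equilibrium(B, mu, R0, h, T):
    """Equilibrium incidence, population size and susceptibles.

    B: entry rate, mu: natural mortality, R0 > 1: basic reproduction number,
    h: eigenvector of K normalized to sum to 1, T: infection durations T_j.
    """
    # Fraction of new cases that survive natural mortality until AIDS death.
    reach_aids = sum(hj * math.exp(-mu * Tj) for hj, Tj in zip(h, T))
    H_total = B * (R0 - 1.0) / (R0 - reach_aids)
    H = [H_total * hj for hj in h]
    N = (B - H_total * reach_aids) / mu
    infected = [Hj * (1.0 - math.exp(-mu * Tj)) / mu for Hj, Tj in zip(H, T)]
    S = N - sum(infected)
    return {"H": H, "N": N, "S": S, "infected": infected}

# Illustrative inputs: B and mu as in Table 1, the rest hypothetical.
eq = equilibrium(B=200.0, mu=0.02, R0=2.0, h=[0.25, 0.75], T=[12.0, 6.0])
```

A useful sanity check on any such implementation is that the returned susceptible fraction `eq["S"] / eq["N"]` equals `1 / R0`.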



Results 



We are interested in how within-host evolutionary dynamics affect 
the evolution of virulence at the epidemiological level. We start 
by considering the within-host dynamics of infection, and then 
look at the epidemiological dynamics, before finally analyzing 
how the system behaves at equilibrium. 

WITHIN-HOST DYNAMICS 

We consider n strains, indexed with i = 1, 2, ..., n, and we assume
the higher the index of the strain initiating an infection, the
higher the SPVL and therefore the more virulent the infection.
Following Shirreff et al. (2011), we assume the viral loads of the
strains are evenly distributed on a log scale, with infection by the
least virulent strain resulting in a SPVL of $1 \times 10^2$ viral particles
per milliliter, and infection by the most virulent strain a SPVL
of $1 \times 10^7$ viral particles per milliliter. In addition, we define the
within-host fitness of strain i as its reproduction rate $g_i$ and assume
that, for increasing i, such rates are evenly distributed on a
linear scale, with the least virulent strain also having the lowest
fitness. Therefore, within-host fitness and virulence are positively
correlated. Here, $g_1 = g_{\min} = 1$ per day throughout, but the value
of $g_n = g_{\max}$ varies.
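The strain parameterization described above is simple to write down explicitly. In this sketch the function name `strain_parameters` and its argument names are hypothetical; the spacing rules (log scale for SPVL between $10^2$ and $10^7$ per milliliter, linear scale for reproduction rates starting at 1 per day) follow the text.

```python
def strain_parameters(n, g_max, g_min=1.0, log10_spvl=(2.0, 7.0)):
    """Return (spvl, g) for n >= 2 strains; index 0 is the least virulent."""
    lo, hi = log10_spvl
    frac = [i / (n - 1) for i in range(n)]
    spvl = [10.0 ** (lo + (hi - lo) * f) for f in frac]   # even on a log scale
    g = [g_min + (g_max - g_min) * f for f in frac]       # even on a linear scale
    return spvl, g

# Four strains with a 2.5% within-host fitness range.
spvl, g = strain_parameters(n=4, g_max=1.025)
```

With four strains, adjacent strains differ by a constant factor in SPVL and by a constant increment in reproduction rate.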

EVOLUTION OCTOBER 2013

KATRINA A. LYTHGOE ET AL.

Figure 1. Representation of the three within-host fitness landscapes
considered. (A) Flat fitness landscape. (B) Hill-climb fitness
landscape. (C) Rugged fitness landscape. Note that, for the rugged
landscape, lethal fitness valleys separate the strains (not shown)
and therefore double mutations are required to move from one
part of the landscape to another. Strains are represented by circles,
and the higher the strain index, the more virulent the virus
(strain 1, dark blue; strain 2, green; strain 3, orange; strain 4, red).
The virulence of the strain initiating an infection determines the
overall infectivity profile of the infection ($\alpha_j$) and the duration of
the infection ($T_j$). Dark blue arrows represent a mutation probability
of $5 \times 10^{-5}$ per replication. Light blue arrows represent a
mutation probability of $2.5 \times 10^{-9}$ per replication.

[Figure 2 image: panels titled Type 1 infection to Type 4 infection; x-axis: Time (years), 0 to 20; legend: Total infectivity, Strain 1, Strain 2, Strain 3, Strain 4.]

Figure 2. Within-host dynamics for the four-strain model for different within-host fitness landscapes. (A) Infectivity profile $\alpha_j(\tau)$ of
type-j individuals (j = 1, 2, 3, 4). This is the rate (per year) at which new infections are made in a fully susceptible homogeneously mixing
population during the course of infection. The type of an infected individual is defined by the strain initiating the infection. The least
virulent strain is strain 1, and the most virulent is strain 4. The duration and infectiousness of the primary and AIDS stages of infection
are the same for all infected individuals, but the more virulent the initiating strain, the shorter and more infectious is the asymptomatic
stage of infection. (B) Relative frequencies of the four strains, by infection type, for a flat within-host fitness landscape. All strains have
the same replicative capacity. (C) Relative frequencies of the four strains, by infection type, for a hill-climb fitness landscape. Strain 1
has the lowest reproduction rate and strain 4 has the highest reproduction rate ($g_1 = 1$ per day and $g_4 = 1.025$ per day). (D) Relative
frequencies of the four strains, by infection type, for a rugged fitness landscape. Strain 1 has the lowest reproduction rate and strain 4
has the highest reproduction rate ($g_1 = 1$ per day and $g_4 = 1.025$ per day).

The HIV within-host fitness landscape appears to be incredibly
complex (Kouyos et al. 2012). To gain an understanding of the
role the landscape has in our model, we consider three idealized
within-host fitness landscapes (Fig. 1). The first is a flat fitness
landscape where $g_{\min} = g_{\max}$, that is, all strains have equal
within-host fitness, and where strain i mutates into strains $i + 1$ or $i - 1$
with probability $5 \times 10^{-5}$ per replication (Mansky and Temin
1995; Gao et al. 2004). The second is a traditional hill-climb fitness
landscape where the strains have different within-host reproduction
rates and where strain i mutates into strains $i + 1$ or $i - 1$
with probability $5 \times 10^{-5}$ per replication. The third we refer to
as a rugged landscape in which strains have different within-host
reproduction rates, all have equal probability of mutating to any
other strain, but where the strains are separated by a lethal fitness
valley: only virions harboring a double mutation can cross the
fitness valley, leading to a mutation probability between strains
of $2.5 \times 10^{-9}$ per replication. Because of the lethal fitness valleys
separating the viable genotypes, this landscape can also be
thought of as a "holey" fitness landscape that incorporates some
features of the hill-climb fitness landscape.
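The three idealized landscapes differ only in which strains are connected by mutation and in the reproduction rates, so they can be encoded as a mutation matrix M and a rate vector g. The sketch below is illustrative: the function names are hypothetical, and it assumes that the diagonal entry of each column is one minus the off-diagonal mutation probabilities (progeny lost to lethal mutations are ignored). Note that the double-mutation probability quoted in the text, 2.5e-9, is the square of the single-mutation probability 5e-5.

```python
def mutation_matrix(n, landscape, m_single=5e-5):
    """M[i][j] = probability that the progeny of strain j is strain i."""
    m_double = m_single ** 2               # 2.5e-9 when m_single = 5e-5
    M = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            if i == j:
                continue
            if landscape in ("flat", "hill-climb"):
                # Only neighbouring strains are one mutation apart.
                M[i][j] = m_single if abs(i - j) == 1 else 0.0
            elif landscape == "rugged":
                # Any other strain is reachable, but only by a double mutation.
                M[i][j] = m_double
            else:
                raise ValueError("unknown landscape: %s" % landscape)
        # Assumption: everything else stays strain j.
        M[j][j] = 1.0 - sum(M[i][j] for i in range(n) if i != j)
    return M

def reproduction_mutation_matrix(M, g):
    """Q = (q_ij) = (m_ij * g_j)."""
    n = len(g)
    return [[M[i][j] * g[j] for j in range(n)] for i in range(n)]

g_flat = [1.0] * 4                                   # equal within-host fitness
g_hill = [1.0 + 0.025 * i / 3 for i in range(4)]     # 1 to 1.025 per day
Q_flat = reproduction_mutation_matrix(mutation_matrix(4, "flat"), g_flat)
Q_hill = reproduction_mutation_matrix(mutation_matrix(4, "hill-climb"), g_hill)
Q_rugged = reproduction_mutation_matrix(mutation_matrix(4, "rugged"), g_hill)
```

The flat and hill-climb landscapes share the same M and differ only in g, while the rugged landscape shares g with the hill-climb landscape and differs only in M.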

Table 1. Variables and parameters used in the model.

Variables

Symbol | Definition | Values
$H_i(t)$, $H_i^*$ | Incidence (defined as number of new cases per year) of type-i infections at time t and at equilibrium |
$I_j(t)$ | Total number of type-j individuals at time t |
$S(t)$, $S^*$ | Number of susceptible hosts at time t and at equilibrium |
$N(t)$, $N^*$ | Total number of hosts at time t and at equilibrium |
$x_{ij}(\tau)$ | Frequency of strain i in a type-j individual at time $\tau$ since infection |
$\beta_{ij}(\tau)$ | Strain-specific infectivity profile: the rate at which type-j individuals transmit strain i at time $\tau$ since infection in a fully susceptible population |
$K = (k_{ij})$ | Next-generation matrix. Each element $k_{ij}$ represents the average number of type-i infections generated in a fully susceptible population by a type-j individual during their entire infectious life |
$R_0$ | Basic reproduction number of the viral population at equilibrium |

Parameters

Symbol | Definition | Values
$B$ | Rate at which individuals enter the exposed population | 200 per year
$\mu$ | Host natural per-capita mortality rate | 0.02 per year
$m_{ij}$ | Probability of strain j mutating into strain i during replication | $5 \times 10^{-5}$ or $2.5 \times 10^{-9}$
$g_i$ | Within-host replication rate of strain i | Variable
$g_{\min}$, $g_{\max}$ | Minimum and maximum within-host reproduction rates | 1, variable
$Q = (q_{ij}) = (m_{ij} g_j)$ | Reproduction-mutation matrix | Variable
$\alpha_j(\tau)$ | Overall infectivity profile of type-j individuals as a function of time $\tau$ since infection | Variable
$T_j$ | Duration of a type-j infection from time of infection to AIDS-related death | Variable
$\sigma_i$ | Relative between-host transmissibility of strain i | Variable
$\sigma_{\min}$, $\sigma_{\max}$ | Relative minimum and maximum between-host transmissibility | 1, 1 or 5

We can see from Figure 2A that individuals infected with a
more virulent strain will be more infectious during asymptomatic
infection, but that the duration of asymptomatic infection will
also be shorter. Because the infectivity profile of the infection,
$\alpha_j(\tau)$, is only determined by the genotype of the infecting strain,
all type-j infections have the same infectivity profile regardless
of the shape of the within-host fitness landscape. If we assume
a flat fitness landscape with four strains, the strain initiating the
infection tends to dominate the within-host dynamics, although
gradually the other strains reach appreciable frequencies as they
approach mutation-selection balance (Fig. 2B). If the strains have
unequal fitnesses at the within-host level, the strain with the highest
within-host reproduction rate will increase in frequency as the
infection progresses (Fig. 2C, D). In general, the larger the fitness
difference between any pair of strains, the faster the fitter strain
outcompetes the less fit strain. This is evident when we look at the
rugged fitness landscape (Fig. 2D), where the fact that any strain
can mutate into any other strain puts the fittest and the starting strains
in direct competition with each other, resulting in faster dynamics
when the starting strain reproduces relatively slowly.
With a smooth hill-climb fitness landscape, the situation is more
complex because strains that have a slow rate of reproduction cannot
mutate directly into rapidly reproducing strains, but instead
need to traverse the fitness landscape through the generation of all
intermediate strains, and this process slows down the dynamics
(Fig. 2C). Our results suggest that the effects of these two conflicting
factors cancel each other out to a good approximation.
Apart from when the starting strain is very unfit, the rugged fitness
landscape exhibits slower within-host dynamics compared to
the hill-climb landscape because of the reduced mutation rate between
viable strains. For a fixed number of strains, widening their
fitness range results in faster within-host dynamics. However,
keeping the range fixed and increasing the number of strains reduces
the fitness differences between adjacently numbered strains,
and tends to slow down the within-host dynamics (not shown).

EPIDEMIOLOGICAL DYNAMICS 

Nesting the within-host model into the epidemiological model,
we can see how the within-host dynamics influence the evolutionary
epidemiology of the virus. The equations describing the
dynamical population model (equations 6.1-6.3) were solved numerically
using the forward Euler method and programmed
independently in two mathematical packages, Mathematica (KL)
and Matlab (LP), enabling us to cross-validate the results. In all
simulations, the initial population size, N, is 10,000, and all individuals
are susceptible except for one individual infected by the
least virulent strain. A list of parameters and variables is given
in Table 1.

Figure 3. Epidemiological dynamics for the 8-strain model for different within-host fitness landscapes. In all numerical integrations the initial host population size is 10,000, the epidemic is initiated by a single individual infected with the least virulent strain, and all eight strains are equally transmissible ($G_i = 1$ for all $i$). The first column shows the total host population size and the total prevalence of infection and the second column shows the proportion of infected individuals by infection type. (A) Flat within-host fitness landscape; all strains have the same within-host fitness. (B) Hill-climb fitness landscape where the fittest strain at the within-host level has a 2.5% fitness advantage over the least-fit strain ($g_1 = 1$ per day and $g_8 = 1.025$ per day). (C) Rugged fitness landscape where the fittest strain at the within-host level has a 2.5% fitness advantage over the least-fit strain ($g_1 = 1$ per day and $g_8 = 1.025$ per day).
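
As an illustration of the numerical scheme only, the following Python sketch applies the forward Euler method to a deliberately simple stand-in model: a single-strain susceptible-infected system with births and deaths. It is not the model of equations 6.1-6.3, and all parameter values and function names are hypothetical. The step size of 0.005 mirrors the value quoted in the Figure 6 caption, with time units of years assumed here.

```python
def forward_euler(deriv, y0, dt, n_steps):
    """Integrate dy/dt = deriv(y) with fixed-step forward Euler.

    Returns the full trajectory as a list of tuples, starting with y0.
    """
    y = list(y0)
    trajectory = [tuple(y)]
    for _ in range(n_steps):
        dy = deriv(y)
        y = [yi + dt * dyi for yi, dyi in zip(y, dy)]
        trajectory.append(tuple(y))
    return trajectory


def si_model(y, birth=200.0, mu=0.02, beta=0.5, alpha=0.1):
    """Hypothetical stand-in model (rates per year).

    s: susceptible hosts, i: infected hosts.
    birth: constant recruitment, mu: natural mortality,
    beta: transmission rate, alpha: disease-induced mortality.
    """
    s, i = y
    n = s + i
    new_infections = beta * s * i / n
    return [birth - mu * s - new_infections,
            new_infections - (mu + alpha) * i]


# 10,000 hosts, one of them infected, integrated over 50 years
# with two step sizes so the results can be compared.
coarse = forward_euler(si_model, [9999.0, 1.0], dt=0.01, n_steps=5000)
fine = forward_euler(si_model, [9999.0, 1.0], dt=0.005, n_steps=10000)
```

Comparing the two runs is a cheap convergence check: if halving the step changes the final prevalence appreciably, or if any state variable becomes negative, the step is too large for the dynamics being integrated.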

Figure 4. Equilibria for the 16-strain model for different within-host fitness landscapes and where all strains are equally transmissible ($G_i = 1$ for all $i$). The left-hand column shows the prevalence of infections (z-axis, vertical) by infection type (x-axis) given the value of $g_{\max} = g_{16}$ (y-axis). The right-hand column shows the mean SPVL at equilibrium (top) and $R_0$ (bottom), given the intensity of within-host competition (i.e., the value of $g_{\max}$). Type-1 infections (i.e., initiated by the least virulent strain) are dark blue and type-16 infections are dark red. (A) Hill-climb within-host fitness landscape. (B) Rugged within-host fitness landscape. We can see that, as the intensity of within-host competition increases, more virulent strains dominate the population at the epidemiological level even though this reduces the fitness ($R_0$) of the viral population.

If we assume a flat within-host fitness landscape, the strain
with the highest transmission potential (e.g., strain 5 in an
eight-strain scenario) rapidly becomes the most prevalent strain within
the population, with the dynamics stabilizing about 90 years
after the start of the epidemic (Fig. 3A). This is in line with the
conclusion reached by Shirreff et al. in their between-host model
of HIV virulence evolution, even though their model did not include
within-host evolution and instead included mutation at the
time of transmission (Shirreff et al. 2011). For both the Shirreff
model, and our model with a flat within-host fitness landscape, 
between-host processes drive the model and the dominant strain 
is the one that has the highest transmission potential. Once we 
assume a hill climb (Fig. 3B) or a rugged (Fig. 3C) within-host 
fitness landscape, more virulent strains dominate the population. 
Strains that are fittest at the within-host level outcompete strains 
that are fitter at the between-host level because of short-sighted 
evolution. The faster the within-host dynamics, the greater the
influence within-host processes have on the epidemiological dynamics
and the more myopic the short-sighted evolution. For
example, where the within-host fitness landscape is a smooth 
hill-climb, the within-host dynamics are relatively fast (Fig. 2C) 
and a highly virulent strain dominates the population (Fig. 3B). 
Where the fitness landscape is more rugged, the within-host dy- 
namics tend to be relatively slow (Fig. 2D) and a less virulent 
strain dominates the population (Fig. 3C). Allowing the composition
of the viral population within the host at any particular time
to influence the overall infectivity profile of the host (rather than
only the viral strain initiating the infection) has only a minor effect
on the epidemiological dynamics (not shown).

Figure 5. Equilibria for the 16-strain model with a hill-climb within-host fitness landscape and where strains are not equally transmissible. The left-hand column shows the prevalence of infections (z-axis, vertical) by infection type (x-axis) given the value of $g_{\max} = g_{16}$ (y-axis). The right-hand column shows the mean SPVL at equilibrium (top) and $R_0$ (bottom), given the intensity of within-host competition (i.e., the value of $g_{\max}$). Type-1 infections (i.e., initiated by the least virulent strain) are dark blue and type-16 infections are dark red. (A) The fittest strain at the within-host level is also the most transmissible ($G_i^{+}$, with $G_{\min} = 1$, $G_{\max} = 5$, see equation (10)). (B) The least-fit strain at the within-host level is the most transmissible ($G_i^{-}$, with $G_{\min} = 1$, $G_{\max} = 5$, see equation (11)). Where we see the least-fit strain at the within-host level dominating at the epidemiological level, the equilibrium is unstable and the system exhibits oscillatory dynamics (see main text and Fig. 6).

EQUILIBRIA 

To get a good understanding of the behavior of the model it is
helpful to consider the system's equilibria. Assuming all strains
are equally transmissible, increasing the difference in the within-host
replication rate between the strains (i.e., increasing $g_{\max}$)
results in more virulent strains dominating the population at the
epidemiological scale (Fig. 4). This is because, as competition at
the within-host level is intensified, within-host processes dominate
over between-host processes, and as a result evolution becomes
more short-sighted. This becomes evident if we consider
the $R_0$ of the viral population, which falls as within-host competition
is intensified (Fig. 4A, B, bottom right panels). Having
a more rugged fitness landscape (Fig. 4B) slows the within-host
dynamics, meaning less virulent strains tend to dominate at the
epidemiological level compared to when the within-host dynamics
are much faster (Fig. 4A). Increasing the number of strains
in the system slightly increases the $R_0$ at equilibrium because
adding more strains tends to slow the within-host dynamics (not
shown). It is interesting to note that if we assume a hill-climb
within-host fitness landscape, the equilibrium consists of multiple
strains circulating between individuals (Fig. 4A), whereas if we
assume a rugged fitness landscape, where all strains are equally
likely to mutate into all other strains, strains are less likely to
coexist (Fig. 4B). A clear understanding of these patterns can
be derived by examination of the next generation matrix, K (not
shown).

Figure 6. Epidemiological dynamics for the 8-strain model exhibiting oscillatory behavior. Here we assumed that the least-fit strain at the within-host level is the most transmissible ($G_i^{-}$, with $G_{\min} = 1$, $G_{\max} = 5$, see equation (11)), and that $g_1 = 1$ and $g_8 = 1.1$. The system is run for 200 years to show that the oscillations are stable. A careful examination of the next generation matrix reveals that type-1 individuals (i.e., those infected by the least virulent strain) can alone drive a self-sustaining epidemic, because element $k_{11}$ of the next generation matrix is larger than 1. Other intermediate strains are also able to self-sustain but, irrespective of the starting conditions, sooner or later type-1 individuals dominate the epidemic because $k_{11} > k_{ii}$ for all $i \neq 1$. However, after their long asymptomatic period, the within-host dynamics will have selected for the most virulent strain (strain 8), which is transmitted during the AIDS phase. This generates a substantial number of type-8 individuals, which consume the susceptible population fast enough to stop the type-1 epidemic and quickly die because of their short infectious life and the inability to sustain an epidemic ($k_{88} < 1$). Further helped by the death of the type-1 individuals that reached the end of their infectious period, the overall prevalence collapses and, in a stochastic model, we would observe extinction of the viral population. However, in our deterministic model, the prevalence of type-1 individuals lingers at extremely low levels (the "attofox" problem of Mollison 1991) until the susceptible population grows enough to trigger another type-1 epidemic, and this generates the observed oscillations. Care was taken to choose a sufficiently small time-step ($dt = 0.005$) so that Euler's integration method does not yield negative values during the prevalence troughs.
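
The kind of examination referred to here can be illustrated with a small numerical sketch. Under the standard definition, $R_0$ is the dominant eigenvalue of the next generation matrix $K$, and a diagonal element $k_{ii} > 1$ indicates that type-$i$ infections can sustain an epidemic on their own (the criterion used in the Figure 6 caption). The 3-strain matrix below is entirely hypothetical and is not derived from the model's parameters; the function names are likewise illustrative.

```python
def dominant_eigenvalue(K, n_iter=1000, tol=1e-12):
    """Dominant eigenvalue of a non-negative square matrix by power iteration.

    The iterate v is kept normalized to sum to 1, so sum(K v)
    converges to the dominant eigenvalue.
    """
    n = len(K)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(n_iter):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_lam = sum(w)
        if new_lam == 0.0:
            return 0.0
        v = [x / new_lam for x in w]
        if abs(new_lam - lam) < tol:
            return new_lam
        lam = new_lam
    return lam


# Hypothetical next generation matrix for three infection types.
K = [
    [1.2, 0.3, 0.0],
    [0.4, 0.9, 0.2],
    [0.1, 0.5, 0.6],
]

r0 = dominant_eigenvalue(K)

# Infection types able to self-sustain on their own (k_ii > 1),
# reported with 0-based indices.
self_sustaining = [i for i in range(len(K)) if K[i][i] > 1.0]
```

In this toy matrix only the first type has a diagonal element above 1, while the overall $R_0$ also reflects the contributions of the off-diagonal elements, that is, infections of one type that give rise to infections of another.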

Recent evidence suggests that some strains of HIV-1 are 
more transmissible than others (Gnanakaran et al. 2011; Go et al.
2011), and we therefore consider the impact that allowing some 
strains to be inherently more transmissible than others has on 
the equilibria. We first consider the scenario where the relative 
transmissibilities of the strains are evenly distributed on a linear 
scale, and where the fittest strains at the within-host level are also 
the more transmissible: 



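$$G_i^{+} = \frac{G_{\min} + (G_{\max} - G_{\min})\,\dfrac{i-1}{n-1}}{\dfrac{1}{n}\sum_{j=1}^{n}\left[G_{\min} + (G_{\max} - G_{\min})\,\dfrac{j-1}{n-1}\right]} \qquad (10)$$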



Throughout, $G_{\min} = 1$ and $G_{\max} = 5$, and the denominator
is chosen to rescale the $G_i$'s to have mean 1.

As expected, infections tend to be initiated by more virulent 
strains than when all strains are equally transmissible (Fig. 5A),
although the effect is fairly small (compare to Fig. 4A). The
situation is not as clear-cut when the least-fit strain at the within- 
host level is most transmissible: 



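$$G_i^{-} = \frac{G_{\min} + (G_{\max} - G_{\min})\,\dfrac{n-i}{n-1}}{\dfrac{1}{n}\sum_{j=1}^{n}\left[G_{\min} + (G_{\max} - G_{\min})\,\dfrac{n-j}{n-1}\right]} \qquad (11)$$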
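
A short Python sketch of the two transmissibility scalings may help make the rescaling concrete. It assumes that the unscaled values are evenly spaced between $G_{\min}$ and $G_{\max}$ across the $n$ strains and are then divided by their mean, which is our reading of the description in the text; the function name and arguments are hypothetical.

```python
def relative_transmissibility(n, G_min=1.0, G_max=5.0, increasing=True):
    """Relative transmissibilities G_1..G_n, rescaled to have mean 1.

    increasing=True : the fittest within-host strain (strain n) is the
                      most transmissible (the G+ scenario).
    increasing=False: the least-fit strain (strain 1) is the most
                      transmissible (the G- scenario).
    Assumption: unscaled values are evenly spaced on a linear scale
    between G_min and G_max.
    """
    if n < 2:
        raise ValueError("need at least two strains")
    raw = []
    for i in range(1, n + 1):
        position = (i - 1) if increasing else (n - i)
        raw.append(G_min + (G_max - G_min) * position / (n - 1))
    mean_raw = sum(raw) / n
    return [x / mean_raw for x in raw]


G_plus = relative_transmissibility(8, increasing=True)
G_minus = relative_transmissibility(8, increasing=False)
```

With $G_{\min} = 1$ and $G_{\max} = 5$, the unscaled values have mean 3, so under this reading the rescaled transmissibilities run from 1/3 to 5/3 and the most transmissible strain is five times as transmissible as the least.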



For small values of $g_{\max}$, less virulent strains tend to dominate
at the population level compared to when all strains are
equally transmissible (compare Figs. 4A, 5B), as might be expected.
However, for higher values of $g_{\max}$ (i.e., where the within-host
dynamics are faster) we see a switch to the least virulent
strain dominating new infections, with a small, but not negligible,
number of new infections initiated by the most virulent strain.
Numerical simulations (Fig. 6) reveal that this equilibrium is not
attracting, and the solution of the full dynamics exhibits stable
oscillations between the most virulent and least virulent strain
dominating the population. However, these oscillations are an unrealistic
consequence of the deterministic nature of our model,
often referred to as the "attofox" phenomenon (Mollison 1991;
see Fig. 6 for a more detailed description of the dynamics in this
unrealistic scenario). In a stochastic model, the viral population
would be expected to go extinct after the first epidemic wave. This
is because short-sighted evolution selects for the fittest within-host
strain, but this strain is unable to drive a self-sustaining epidemic
due to its inefficiency at transmitting between hosts.

Conclusions/Discussion

We have presented a general framework allowing researchers 
to model multistrain within-between host models of pathogen 
evolution, and demonstrate the utility of the approach by studying 
the evolution of virulence in HIV infection. If there is little 
within-host evolution, because of limited competition between 
strains, a large number of competing strains, and/or a within-host 
adaptive landscape that is difficult to traverse, virulence will 
evolve toward an intermediate level that maximizes the trans- 
mission potential of infections and the number of hosts infected. 
However, if there are greater opportunities for within-host 
selection, virulence is expected to evolve toward a higher level 
even though this reduces the transmission potential of infections 
and the number of hosts infected. 

Data suggest that the replicative capacity of HIV tends to 
slowly increase during the course of infections, but by a relatively 
modest amount compared to the variation in replicative capacities 
found at the population level (Troyer et al. 2005; Kouyos et al. 
2011). If replicative capacities do increase by the small amount 
suggested in these studies, that is where the virulence of the strain 
an individual is infected with is similar to the virulence of the 
strain that individual tends to transmit, our model predicts that 
HIV will evolve a level of virulence very close to the level that 
will maximize the transmission potential of the virus. According 
to a recent meta-analysis, HIV virulence has increased over the 
past two decades, but the upward trend has plateaued off in the 
last few years (Herbeck et al. 2011). Because current levels of 
HIV virulence maximize the transmission potential of the virus 
(Fraser et al. 2007; Shirreff et al. 2011), we predict that HIV is
unlikely to get much more virulent, if at all, in years to come. 

Of course, it is interesting to wonder why the replicative 
capacity does not increase much more rapidly during the course 
of infection than it appears to do. One possibility is that the within-host
adaptive landscape is extremely large, rugged, and difficult to
traverse (Kouyos et al. 2012). In such a situation, the within-host
viral population will only be able to explore a small corner of the
landscape and as a result between-host selection pressures will 
overshadow within-host evolution. In addition, the host immune 
system is likely to have a significant role. Although evidence 
suggests that the intrinsic replicative capacity (by which we mean 
the replicative capacity as measured in vitro) of the strain of 
HIV an individual is infected with determines SPVL (Quinones- 
Mateu et al. 2000; Trkola et al. 2003; Daar et al. 2005; Joos et al. 
2005; Kouyos et al. 2011), this might not be correlated with the 
ability of the virus to replicate in the face of an adaptive immune 
response, that is, the realized replicative capacity of the virus
during the course of infection. For example, strains harboring
CTL escape mutations often have reduced in vitro replicative 
capacities, but will be under positive selection during the course 
of an infection (Mostowy et al. 2012). Consequently, although 
the intrinsic replicative capacity of the infecting viral strain will 
influence the transmission potential of the infection, this intrinsic 
replicative capacity might have only a small influence on the 
within-host dynamics. On a related issue, it is worth noting the 
small body of evidence showing that ancestral strains of HIV (i.e., 
those that initiate infections) are stored in memory T cells and then 
expressed and preferentially transmitted over strains circulating 
later in infection (Lythgoe and Fraser 2012; Redd et al. 2012a). 
Even if the reproduction rate of viruses increases substantially 
during the course of infection, transmission of ancestral strains 
would effectively by-pass this within-host evolution, ultimately 
favoring strains with the highest transmission potential. 

As a final point, we reiterate that we have made some strong 
assumptions in this model. First, we have assumed here that all 
hosts are identical. However, we know that host genetic factors 
affect the ability of individuals to control HIV infection in terms of 
SPVL and duration of infection (Fellay et al. 2009). If the relative 
ranking of viral strains, in terms of their replicative capacity and 
virulence, tends to be the same in all hosts, we do not expect host 
heterogeneity to drastically alter our conclusions: we would still 
expect short-sighted within-host evolution to drive up virulence 
at the epidemiological level. However, if the situation is more 
complex and the relative ranking is different in different hosts, for 
example due to HLA heterogeneity, it is not immediately clear 
how this will affect the evolution of the viral population, and 
this is an important area of future study. Second, we have used 
a very idealized quasi-species approximation to model within- 
host viral dynamics, which is clearly an oversimplification; for 
example, new research is now enabling us to gain insight into the 
selection pressures faced by the virus at the very earliest stages of 
infection (Bar et al. 2012). Substituting the quasi-species model 
for a mechanistic within-host model that incorporates some of this 
complexity might provide useful insights, although the difficulty 
will be in keeping such models simple enough that they are still 
tractable. Finally, it is important to realize that the model presented 
here is deterministic. We would expect a stochastic version of the 
model to slow the rate of within-host evolution, thus tilting the 
balance toward strains that have a higher transmission potential, 
but it is not clear how strong this effect will be. 

Predicting how virulence, or any other trait, is likely to evolve 
in any particular system is difficult because we have two levels 
of selection to consider, within hosts and between hosts, and of- 
ten there are a large number of circulating strains (Shankarappa 
et al. 1999) that need to be considered. Here we have constructed 
a framework that allows researchers to link these two levels of 
selection and to accommodate a large number of pathogen strains. 
Although we have focused this framework on HIV, with a very
simplistic model of within-host evolution, the modeling frame- 
work is general enough that it can be easily adapted to fit a broad 
range of situations in which one wants to model multiple strains 
circulating at the within- and between-host levels. 